Overview

Dataset statistics

Number of variables: 12
Number of observations: 782
Missing cells: 314
Missing cells (%): 3.3%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 170.8 KiB
Average record size in memory: 223.6 B
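
The missing-cell percentage above follows from the other figures in this table. As a minimal sketch (the helper name `missing_share` is illustrative and not part of pandas-profiling), the share is the number of missing cells divided by the number of rows times the number of columns:

```python
def missing_share(missing_cells, n_rows, n_columns=1):
    """Return the fraction of cells that are missing in an n_rows x n_columns table."""
    total_cells = n_rows * n_columns
    if total_cells <= 0:
        raise ValueError("table must contain at least one cell")
    return missing_cells / total_cells


# Figures taken from this report: 314 missing cells, 782 observations, 12 variables.
dataset_share = missing_share(314, 782, 12)
# The dystopia column alone has 312 missing values out of 782 rows.
dystopia_share = missing_share(312, 782)

print(f"dataset: {dataset_share:.1%}")    # dataset: 3.3%
print(f"dystopia: {dystopia_share:.1%}")  # dystopia: 39.9%
```

Of the 314 missing cells, 312 belong to dystopia; the remaining two are the single missing values in region and trust.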

Variable types

NUM: 10
CAT: 2

Reproduction

Analysis started: 2020-05-14 03:16:38.650017
Analysis finished: 2020-05-14 03:16:56.397367
Version: pandas-profiling v2.6.0
Command line: pandas_profiling --config_file config.yaml [YOUR_FILE.csv]
Download configuration: config.yaml

Warnings

country_mapped has a high cardinality: 164 distinct values (High cardinality)
score is highly correlated with rank (High correlation)
rank is highly correlated with score (High correlation)
dystopia has 312 (39.9%) missing values (Missing)

Variables

country_mapped
Categorical

HIGH CARDINALITY
UNIFORM
Distinct count: 164
Unique (%): 21.0%
Missing: 0
Missing (%): 0.0%
Memory size: 6.2 KiB

Lithuania: 5
Iraq: 5
Serbia: 5
Mauritania: 5
Armenia: 5
Other values (159): 757

Common values

Value | Count | Frequency (%)
Lithuania | 5 | 0.6%
Iraq | 5 | 0.6%
Serbia | 5 | 0.6%
Mauritania | 5 | 0.6%
Armenia | 5 | 0.6%
Spain | 5 | 0.6%
Malta | 5 | 0.6%
Tanzania | 5 | 0.6%
Rwanda | 5 | 0.6%
South Korea | 5 | 0.6%
Other values (154) | 732 | 93.6%

Length

Max length: 24
Mean length: 8.164961637
Min length: 4

Unicode categories

Value | Count | Frequency (%)
Lowercase_Letter | 26 | 49.1%
Uppercase_Letter | 24 | 45.3%
Open_Punctuation | 1 | 1.9%
Space_Separator | 1 | 1.9%
Close_Punctuation | 1 | 1.9%

Unicode scripts

Value | Count | Frequency (%)
Latin | 50 | 94.3%
Common | 3 | 5.7%

Unicode blocks

Value | Count | Frequency (%)
ASCII | 53 | 100.0%

region
Categorical

Distinct count: 10
Unique (%): 1.3%
Missing: 1
Missing (%): 0.1%
Memory size: 6.2 KiB

Sub-Saharan Africa: 195
Central and Eastern Europe: 145
Latin America and Caribbean: 111
Western Europe: 105
Middle East and Northern Africa: 96
Other values (5): 129

Common values

Value | Count | Frequency (%)
Sub-Saharan Africa | 195 | 24.9%
Central and Eastern Europe | 145 | 18.5%
Latin America and Caribbean | 111 | 14.2%
Western Europe | 105 | 13.4%
Middle East and Northern Africa | 96 | 12.3%
Southeastern Asia | 44 | 5.6%
Southern Asia | 35 | 4.5%
Eastern Asia | 30 | 3.8%
Australia and New Zealand | 10 | 1.3%
North America | 10 | 1.3%
(Missing) | 1 | 0.1%

Length

Max length: 31
Mean length: 21.31585678
Min length: 3

Unicode categories

Value | Count | Frequency (%)
Lowercase_Letter | 18 | 62.1%
Uppercase_Letter | 9 | 31.0%
Dash_Punctuation | 1 | 3.4%
Space_Separator | 1 | 3.4%

Unicode scripts

Value | Count | Frequency (%)
Latin | 27 | 93.1%
Common | 2 | 6.9%

Unicode blocks

Value | Count | Frequency (%)
ASCII | 29 | 100.0%

year
Real number (ℝ≥0)

Distinct count: 5
Unique (%): 0.6%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 2016.9936061381075
Minimum: 2015
Maximum: 2019
Zeros: 0
Zeros (%): 0.0%
Memory size: 6.2 KiB

Quantile statistics

Minimum: 2015
5-th percentile: 2015
Q1: 2016
Median: 2017
Q3: 2018
95-th percentile: 2019
Maximum: 2019
Range: 4
Interquartile range (IQR): 2

Descriptive statistics

Standard deviation: 1.417364432
Coefficient of variation (CV): 0.0007027114157
Kurtosis: -1.305269806
Mean: 2016.993606
Median Absolute Deviation (MAD): 1.204567605
Skewness: 0.005903894403
Sum: 1577289
Variance: 2.008921934
Histogram with fixed size bins (bins=10)
Histogram with variable size bins (bins=[2015. 2015.5 2018.5 2019. ], "bayesian blocks" binning strategy used)

Common values

Value | Count | Frequency (%)
2015 | 158 | 20.2%
2016 | 157 | 20.1%
2019 | 156 | 19.9%
2018 | 156 | 19.9%
2017 | 155 | 19.8%

Minimum 5 values

Value | Count | Frequency (%)
2015 | 158 | 20.2%
2016 | 157 | 20.1%
2017 | 155 | 19.8%
2018 | 156 | 19.9%
2019 | 156 | 19.9%

Maximum 5 values

Value | Count | Frequency (%)
2019 | 156 | 19.9%
2018 | 156 | 19.9%
2017 | 155 | 19.8%
2016 | 157 | 20.1%
2015 | 158 | 20.2%

rank
Real number (ℝ≥0)

HIGH CORRELATION
UNIFORM
Distinct count: 158
Unique (%): 20.2%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 78.69820971867007
Minimum: 1
Maximum: 158
Zeros: 0
Zeros (%): 0.0%
Memory size: 6.2 KiB

Quantile statistics

Minimum: 1
5-th percentile: 8.05
Q1: 40
Median: 79
Q3: 118
95-th percentile: 149
Maximum: 158
Range: 157
Interquartile range (IQR): 78

Descriptive statistics

Standard deviation: 45.18238438
Coefficient of variation (CV): 0.5741221375
Kurtosis: -1.199701126
Mean: 78.69820972
Median Absolute Deviation (MAD): 39.10307363
Skewness: 0.0004973514565
Sum: 61542
Variance: 2041.447859
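
Several of the figures for rank are derived from others in the same block. The sketch below takes the rounded values printed in this report as inputs and recomputes the range, the interquartile range, and the coefficient of variation. The variable names are illustrative only.

```python
# Inputs copied from the rank statistics in this report.
minimum, maximum = 1, 158
q1, q3 = 40, 118
mean = 78.69820972
standard_deviation = 45.18238438

value_range = maximum - minimum                      # Range
interquartile_range = q3 - q1                        # IQR
coefficient_of_variation = standard_deviation / mean # CV = standard deviation / mean

print(value_range)                          # 157
print(interquartile_range)                  # 78
print(round(coefficient_of_variation, 4))   # 0.5741
```

Because the inputs are rounded, the recomputed CV agrees with the reported 0.5741221375 only up to rounding error.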
Histogram with fixed size bins (bins=10)
Histogram with variable size bins (bins=[ 1. 158.], "bayesian blocks" binning strategy used)

Common values

Value | Count | Frequency (%)
57 | 6 | 0.8%
82 | 6 | 0.8%
34 | 6 | 0.8%
145 | 6 | 0.8%
50 | 5 | 0.6%
56 | 5 | 0.6%
55 | 5 | 0.6%
54 | 5 | 0.6%
53 | 5 | 0.6%
52 | 5 | 0.6%
Other values (148) | 728 | 93.1%

Minimum 5 values

Value | Count | Frequency (%)
1 | 5 | 0.6%
2 | 5 | 0.6%
3 | 5 | 0.6%
4 | 5 | 0.6%
5 | 5 | 0.6%

Maximum 5 values

Value | Count | Frequency (%)
158 | 1 | 0.1%
157 | 2 | 0.3%
156 | 4 | 0.5%
155 | 5 | 0.6%
154 | 5 | 0.6%

score
Real number (ℝ≥0)

HIGH CORRELATION
Distinct count: 716
Unique (%): 91.6%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 5.379017902998669
Minimum: 2.69300007820129
Maximum: 7.769
Zeros: 0
Zeros (%): 0.0%
Memory size: 6.2 KiB

Quantile statistics

Minimum: 2.693000078
5-th percentile: 3.58715
Q1: 4.50975
Median: 5.322
Q3: 6.1895
95-th percentile: 7.31395
Maximum: 7.769
Range: 5.075999922
Interquartile range (IQR): 1.67975

Descriptive statistics

Standard deviation: 1.12745646
Coefficient of variation (CV): 0.2096026599
Kurtosis: -0.7610545866
Mean: 5.379017903
Median Absolute Deviation (MAD): 0.9385068492
Skewness: 0.03585943327
Sum: 4206.392
Variance: 1.27115807
Histogram with fixed size bins (bins=10)
Histogram with variable size bins (bins=[2.69300008 3.46199995 4.13899997 6.4885 7.616 7.769 ], "bayesian blocks" binning strategy used)

Common values

Value | Count | Frequency (%)
5.835 | 3 | 0.4%
6.375 | 3 | 0.4%
6.379 | 3 | 0.4%
2.905 | 3 | 0.4%
5.129 | 3 | 0.4%
4.35 | 3 | 0.4%
5.89 | 3 | 0.4%
5.192 | 3 | 0.4%
4.36 | 2 | 0.3%
4.356 | 2 | 0.3%
Other values (706) | 754 | 96.4%

Minimum 5 values

Value | Count | Frequency (%)
2.693000078 | 1 | 0.1%
2.839 | 1 | 0.1%
2.853 | 1 | 0.1%
2.904999971 | 1 | 0.1%
2.905 | 3 | 0.4%

Maximum 5 values

Value | Count | Frequency (%)
7.769 | 1 | 0.1%
7.632 | 1 | 0.1%
7.6 | 1 | 0.1%
7.594 | 1 | 0.1%
7.587 | 1 | 0.1%

gdp_pc
Real number (ℝ≥0)

Distinct count: 742
Unique (%): 94.9%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 0.9160474824829717
Minimum: 0.0
Maximum: 2.096
Zeros: 5
Zeros (%): 0.6%
Memory size: 6.2 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0.208797
Q1: 0.6065
Median: 0.9822047088
Q3: 1.236187109
95-th percentile: 1.487882078
Maximum: 2.096
Range: 2.096
Interquartile range (IQR): 0.629687109

Descriptive statistics

Standard deviation: 0.4073401313
Coefficient of variation (CV): 0.4446714161
Kurtosis: -0.6927595054
Mean: 0.9160474825
Median Absolute Deviation (MAD): 0.3389627517
Skewness: -0.3185805094
Sum: 716.3491313
Variance: 0.1659259826
Histogram with fixed size bins (bins=10)
Histogram with variable size bins (bins=[0. 0.25579 0.87243597 1.40584302 1.50933 1.69489883 2.096 ], "bayesian blocks" binning strategy used)

Common values

Value | Count | Frequency (%)
0 | 5 | 0.6%
0.96 | 4 | 0.5%
0.332 | 3 | 0.4%
1.34 | 3 | 0.4%
0.308 | 2 | 0.3%
1.221 | 2 | 0.3%
0.642 | 2 | 0.3%
1.092 | 2 | 0.3%
1.017 | 2 | 0.3%
0.274 | 2 | 0.3%
Other values (732) | 755 | 96.5%

Minimum 5 values

Value | Count | Frequency (%)
0 | 5 | 0.6%
0.0153 | 1 | 0.1%
0.01604 | 1 | 0.1%
0.02264318429 | 1 | 0.1%
0.024 | 1 | 0.1%

Maximum 5 values

Value | Count | Frequency (%)
2.096 | 1 | 0.1%
1.870765686 | 1 | 0.1%
1.82427 | 1 | 0.1%
1.741943598 | 1 | 0.1%
1.69752 | 1 | 0.1%

family
Real number (ℝ≥0)

Distinct count: 732
Unique (%): 93.6%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 1.0783924825069788
Minimum: 0.0
Maximum: 1.644
Zeros: 5
Zeros (%): 0.6%
Memory size: 6.2 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0.46133
Q1: 0.8693625
Median: 1.124735
Q3: 1.32725
95-th percentile: 1.522
Maximum: 1.644
Range: 1.644
Interquartile range (IQR): 0.4578875

Descriptive statistics

Standard deviation: 0.3295483193
Coefficient of variation (CV): 0.305592189
Kurtosis: 0.1584486833
Mean: 1.078392483
Median Absolute Deviation (MAD): 0.2662289482
Skewness: -0.6846322898
Sum: 843.3029213
Variance: 0.1086020948
Histogram with fixed size bins (bins=10)
Histogram with variable size bins (bins=[0. 0.375 0.7545513 1.55861556 1.644 ], "bayesian blocks" binning strategy used)

Common values

Value | Count | Frequency (%)
0 | 5 | 0.6%
1.125 | 3 | 0.4%
1.41 | 3 | 0.4%
1.465 | 3 | 0.4%
1.438 | 3 | 0.4%
1.504 | 3 | 0.4%
1.538 | 3 | 0.4%
1.487 | 2 | 0.3%
1.369 | 2 | 0.3%
1.301 | 2 | 0.3%
Other values (722) | 753 | 96.3%

Minimum 5 values

Value | Count | Frequency (%)
0 | 5 | 0.6%
0.10419 | 1 | 0.1%
0.11037 | 1 | 0.1%
0.13995 | 1 | 0.1%
0.147 | 1 | 0.1%

Maximum 5 values

Value | Count | Frequency (%)
1.644 | 1 | 0.1%
1.624 | 1 | 0.1%
1.610574007 | 1 | 0.1%
1.601 | 1 | 0.1%
1.592 | 1 | 0.1%

health
Real number (ℝ≥0)

Distinct count: 705
Unique (%): 90.2%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 0.612415577116253
Minimum: 0.0
Maximum: 1.141
Zeros: 5
Zeros (%): 0.6%
Memory size: 6.2 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0.1578945
Q1: 0.4401825
Median: 0.6473095147
Q3: 0.808
95-th percentile: 0.954973
Maximum: 1.141
Range: 1.141
Interquartile range (IQR): 0.3678175

Descriptive statistics

Standard deviation: 0.2483086404
Coefficient of variation (CV): 0.4054577474
Kurtosis: -0.487571207
Mean: 0.6124155771
Median Absolute Deviation (MAD): 0.2022435913
Skewness: -0.5012025622
Sum: 478.9089813
Variance: 0.06165718089
Histogram with fixed size bins (bins=10)
Histogram with variable size bins (bins=[0. 0.00278238 0.1508554 0.573825 0.914455 1.0485 1.141 ], "bayesian blocks" binning strategy used)

Common values

Value | Count | Frequency (%)
0.815 | 5 | 0.6%
0.999 | 5 | 0.6%
0 | 5 | 0.6%
0.828 | 4 | 0.5%
0.874 | 4 | 0.5%
0.871 | 3 | 0.4%
0.854 | 3 | 0.4%
0.861 | 3 | 0.4%
0.884 | 3 | 0.4%
0.808 | 3 | 0.4%
Other values (695) | 744 | 95.1%

Minimum 5 values

Value | Count | Frequency (%)
0 | 5 | 0.6%
0.005564753897 | 1 | 0.1%
0.01 | 1 | 0.1%
0.0187726859 | 1 | 0.1%
0.03824 | 1 | 0.1%

Maximum 5 values

Value | Count | Frequency (%)
1.141 | 1 | 0.1%
1.122 | 1 | 0.1%
1.088 | 1 | 0.1%
1.062 | 1 | 0.1%
1.052 | 1 | 0.1%

freedom
Real number (ℝ≥0)

Distinct count: 697
Unique (%): 89.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 0.4110908258223149
Minimum: 0.0
Maximum: 0.7240000000000001
Zeros: 5
Zeros (%): 0.6%
Memory size: 6.2 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0.128096
Q1: 0.3097675
Median: 0.431
Q3: 0.531
95-th percentile: 0.6308885
Maximum: 0.724
Range: 0.724
Interquartile range (IQR): 0.2212325

Descriptive statistics

Standard deviation: 0.1528804206
Coefficient of variation (CV): 0.3718896434
Kurtosis: -0.3072054061
Mean: 0.4110908258
Median Absolute Deviation (MAD): 0.1242625761
Skewness: -0.5212591254
Sum: 321.4730258
Variance: 0.02337242301
Histogram with fixed size bins (bins=10)
Histogram with variable size bins (bins=[0. 0.223 0.3545 0.5975 0.66123 0.724 ], "bayesian blocks" binning strategy used)

Common values

Value | Count | Frequency (%)
0 | 5 | 0.6%
0.557 | 4 | 0.5%
0.454 | 3 | 0.4%
0.334 | 3 | 0.4%
0.406 | 3 | 0.4%
0.394 | 3 | 0.4%
0.312 | 3 | 0.4%
0.498 | 3 | 0.4%
0.431 | 3 | 0.4%
0.531 | 3 | 0.4%
Other values (687) | 749 | 95.8%

Minimum 5 values

Value | Count | Frequency (%)
0 | 5 | 0.6%
0.00589 | 1 | 0.1%
0.01 | 1 | 0.1%
0.013 | 1 | 0.1%
0.01499585528 | 1 | 0.1%

Maximum 5 values

Value | Count | Frequency (%)
0.724 | 1 | 0.1%
0.696 | 1 | 0.1%
0.686 | 1 | 0.1%
0.683 | 1 | 0.1%
0.681 | 1 | 0.1%

trust
Real number (ℝ≥0)

Distinct count: 635
Unique (%): 81.3%
Missing: 1
Missing (%): 0.1%
Infinite: 0
Infinite (%): 0.0%
Mean: 0.12543561357358715
Minimum: 0.0
Maximum: 0.55191
Zeros: 6
Zeros (%): 0.8%
Memory size: 6.2 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0.018
Q1: 0.054
Median: 0.091
Q3: 0.15603
95-th percentile: 0.37124
Maximum: 0.55191
Range: 0.55191
Interquartile range (IQR): 0.10203

Descriptive statistics

Standard deviation: 0.1058164476
Coefficient of variation (CV): 0.8435917404
Kurtosis: 1.880108294
Mean: 0.1254356136
Median Absolute Deviation (MAD): 0.07896168061
Skewness: 1.5208882
Sum: 97.9652142
Variance: 0.01119712057
Histogram with fixed size bins (bins=10)

Common values

Value | Count | Frequency (%)
0.082 | 7 | 0.9%
0 | 6 | 0.8%
0.034 | 6 | 0.8%
0.064 | 6 | 0.8%
0.028 | 6 | 0.8%
0.078 | 6 | 0.8%
0.056 | 5 | 0.6%
0.074 | 5 | 0.6%
0.055 | 5 | 0.6%
0.093 | 5 | 0.6%
Other values (625) | 724 | 92.6%

Minimum 5 values

Value | Count | Frequency (%)
0 | 6 | 0.8%
0.001 | 1 | 0.1%
0.00227 | 1 | 0.1%
0.00322 | 1 | 0.1%
0.004 | 1 | 0.1%

Maximum 5 values

Value | Count | Frequency (%)
0.55191 | 1 | 0.1%
0.52208 | 1 | 0.1%
0.50521 | 1 | 0.1%
0.4921 | 1 | 0.1%
0.48357 | 1 | 0.1%

generosity
Real number (ℝ≥0)

Distinct count: 664
Unique (%): 84.9%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 0.21857584156082985
Minimum: 0.0
Maximum: 0.8380751609802249
Zeros: 5
Zeros (%): 0.6%
Memory size: 6.2 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0.05403037467
Q1: 0.13
Median: 0.2019822115
Q3: 0.2788325
95-th percentile: 0.4704543383
Maximum: 0.838075161
Range: 0.838075161
Interquartile range (IQR): 0.1488325

Descriptive statistics

Standard deviation: 0.1223207487
Coefficient of variation (CV): 0.5596261135
Kurtosis: 2.020258278
Mean: 0.2185758416
Median Absolute Deviation (MAD): 0.09334568909
Skewness: 1.044360015
Sum: 170.9263081
Variance: 0.01496236557
Histogram with fixed size bins (bins=10)
Histogram with variable size bins (bins=[0. 0.000995 0.0255 0.09796325 0.286285 0.38658294 0.51832 0.83807516], "bayesian blocks" binning strategy used)

Common values

Value | Count | Frequency (%)
0.175 | 6 | 0.8%
0 | 5 | 0.6%
0.187 | 5 | 0.6%
0.153 | 5 | 0.6%
0.197 | 4 | 0.5%
0.099 | 4 | 0.5%
0.083 | 4 | 0.5%
0.22 | 3 | 0.4%
0.137 | 3 | 0.4%
0.285 | 3 | 0.4%
Other values (654) | 740 | 94.6%

Minimum 5 values

Value | Count | Frequency (%)
0 | 5 | 0.6%
0.00199 | 1 | 0.1%
0.01016465668 | 1 | 0.1%
0.02025 | 1 | 0.1%
0.025 | 1 | 0.1%

Maximum 5 values

Value | Count | Frequency (%)
0.838075161 | 1 | 0.1%
0.81971 | 1 | 0.1%
0.79588 | 1 | 0.1%
0.6117045879 | 1 | 0.1%
0.598 | 1 | 0.1%

dystopia
Real number (ℝ≥0)

MISSING
Distinct count: 470
Unique (%): 100.0%
Missing: 312
Missing (%): 39.9%
Infinite: 0
Infinite (%): 0.0%
Mean: 2.092716638021185
Minimum: 0.32858000000000004
Maximum: 3.83772
Zeros: 0
Zeros (%): 0.0%
Memory size: 6.2 KiB

Quantile statistics

Minimum: 0.32858
5-th percentile: 1.126492703
Q1: 1.737975
Median: 2.09464
Q3: 2.455574545
95-th percentile: 3.025559
Maximum: 3.83772
Range: 3.50914
Interquartile range (IQR): 0.7175995455

Descriptive statistics

Standard deviation: 0.5657717565
Coefficient of variation (CV): 0.2703527779
Kurtosis: 0.4141306299
Mean: 2.092716638
Median Absolute Deviation (MAD): 0.4396580606
Skewness: -0.1216469469
Sum: 983.5768199
Variance: 0.3200976805
Histogram with fixed size bins (bins=10)

Common values

Value | Count | Frequency (%)
1.83302 | 1 | 0.1%
3.31029 | 1 | 0.1%
2.26646 | 1 | 0.1%
1.34759 | 1 | 0.1%
1.99375 | 1 | 0.1%
2.40364 | 1 | 0.1%
1.59888 | 1 | 0.1%
1.322916269 | 1 | 0.1%
1.784892559 | 1 | 0.1%
2.43801 | 1 | 0.1%
Other values (460) | 460 | 58.8%
(Missing) | 312 | 39.9%

Minimum 5 values

Value | Count | Frequency (%)
0.32858 | 1 | 0.1%
0.3779137135 | 1 | 0.1%
0.4193892479 | 1 | 0.1%
0.5400612354 | 1 | 0.1%
0.5546331406 | 1 | 0.1%

Maximum 5 values

Value | Count | Frequency (%)
3.83772 | 1 | 0.1%
3.60214 | 1 | 0.1%
3.55906 | 1 | 0.1%
3.50733 | 1 | 0.1%
3.40904 | 1 | 0.1%

Interactions

Correlations

Pearson's r

Pearson's correlation coefficient (r) is a measure of linear correlation between two variables. Its value lies between -1 and +1, with -1 indicating total negative linear correlation, 0 indicating no linear correlation, and +1 indicating total positive linear correlation. Furthermore, r is invariant under separate changes in location and scale of the two variables, implying that for a linear function the angle to the x-axis does not affect r.

To calculate r for two variables X and Y, one divides the covariance of X and Y by the product of their standard deviations.
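
The calculation described above can be sketched in plain Python. The function name `pearson_r` is illustrative, and the inputs are the five Afghanistan rows from the sample in this report, so the result is only an example on a small subset and not the coefficient the report computed for the full dataset.

```python
import math


def pearson_r(xs, ys):
    """Covariance of xs and ys divided by the product of their standard deviations."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length with at least 2 values")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / n
    std_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs) / n)
    std_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys) / n)
    if std_x == 0 or std_y == 0:
        raise ValueError("r is undefined when a variable is constant")
    return covariance / (std_x * std_y)


# rank and score for Afghanistan, 2015 to 2019, taken from the sample rows.
rank = [153, 154, 141, 145, 154]
score = [3.575, 3.360, 3.794, 3.632, 3.203]

r = pearson_r(rank, score)

# Shifting and rescaling each variable (with positive factors) leaves r unchanged.
r_shifted = pearson_r([10 + 2 * x for x in rank], [5 + 0.5 * y for y in score])

print(r < 0)                       # a better (lower) rank goes with a higher score
print(abs(r - r_shifted) < 1e-9)   # True: invariant under location and scale
```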

Spearman's ρ

Spearman's rank correlation coefficient (ρ) is a measure of monotonic correlation between two variables, and is therefore better at catching nonlinear monotonic correlations than Pearson's r. Its value lies between -1 and +1, with -1 indicating total negative monotonic correlation, 0 indicating no monotonic correlation, and +1 indicating total positive monotonic correlation.

To calculate ρ for two variables X and Y, one divides the covariance of the rank variables of X and Y by the product of their standard deviations.
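
As a rough sketch of this procedure, the code below first converts each variable to ranks and then applies the covariance-over-standard-deviations formula to those ranks. The helper names and the choice to give tied values their average rank are assumptions made for illustration. The toy data (x against its cube) is invented to show a nonlinear but monotonic relationship.

```python
import math


def pearson_r(xs, ys):
    """Covariance divided by the product of the standard deviations."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / n
    std_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs) / n)
    std_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys) / n)
    return covariance / (std_x * std_y)


def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values equal to the one at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman_rho(xs, ys):
    """Pearson's r applied to the rank variables of xs and ys."""
    return pearson_r(average_ranks(xs), average_ranks(ys))


x = [1, 2, 3, 4, 5]
y = [v ** 3 for v in x]  # monotonic but not linear

rho = spearman_rho(x, y)
linear_r = pearson_r(x, y)
print(round(rho, 6))   # 1.0
print(linear_r < 1.0)  # True: Pearson's r is below 1 for the same data
```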

Kendall's τ

Similar to Spearman's rank correlation coefficient, the Kendall rank correlation coefficient (τ) measures ordinal association between two variables. Its value lies between -1 and +1, with -1 indicating total negative correlation, 0 indicating no correlation, and +1 indicating total positive correlation.

To calculate τ for two variables X and Y, one determines the number of concordant and discordant pairs of observations. τ is then the number of concordant pairs minus the number of discordant pairs, divided by the total number of pairs.
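
The pair-counting definition above can be sketched directly. This follows the simple form described in the text (often called tau-a), in which tied pairs count as neither concordant nor discordant. Statistical libraries commonly use tie-adjusted variants such as tau-b, so their values can differ when ties are present. The function name is illustrative, and the inputs are again the five Afghanistan rows from the sample.

```python
from itertools import combinations


def kendall_tau(xs, ys):
    """(concordant pairs - discordant pairs) / total number of pairs."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length with at least 2 values")
    concordant = 0
    discordant = 0
    for (x1, y1), (x2, y2) in combinations(zip(xs, ys), 2):
        direction = (x1 - x2) * (y1 - y2)
        if direction > 0:
            concordant += 1   # both variables move in the same direction
        elif direction < 0:
            discordant += 1   # the variables move in opposite directions
        # direction == 0 means a tie in x or y; it counts toward neither total
    total_pairs = n * (n - 1) // 2
    return (concordant - discordant) / total_pairs


# rank and score for Afghanistan, 2015 to 2019, taken from the sample rows.
rank = [153, 154, 141, 145, 154]
score = [3.575, 3.360, 3.794, 3.632, 3.203]

tau = kendall_tau(rank, score)
print(tau)  # -0.9
```

Of the 10 pairs, 9 are discordant and 1 is tied (rank 154 appears twice), which gives (0 - 9) / 10.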

Missing values

Sample

First rows

index | country_mapped | region | year | rank | score | gdp_pc | family | health | freedom | trust | generosity | dystopia
0 | Afghanistan | Southern Asia | 2015 | 153 | 3.575 | 0.319820 | 0.302850 | 0.303350 | 0.234140 | 0.097190 | 0.365100 | 1.952100
1 | Afghanistan | Southern Asia | 2016 | 154 | 3.360 | 0.382270 | 0.110370 | 0.173440 | 0.164300 | 0.071120 | 0.312680 | 2.145580
2 | Afghanistan | Southern Asia | 2017 | 141 | 3.794 | 0.401477 | 0.581543 | 0.180747 | 0.106180 | 0.061158 | 0.311871 | 2.150801
3 | Afghanistan | Southern Asia | 2018 | 145 | 3.632 | 0.332000 | 0.537000 | 0.255000 | 0.085000 | 0.036000 | 0.191000 | NaN
4 | Afghanistan | Southern Asia | 2019 | 154 | 3.203 | 0.350000 | 0.517000 | 0.361000 | 0.000000 | 0.025000 | 0.158000 | NaN
5 | Albania | Central and Eastern Europe | 2015 | 95 | 4.959 | 0.878670 | 0.804340 | 0.813250 | 0.357330 | 0.064130 | 0.142720 | 1.898940
6 | Albania | Central and Eastern Europe | 2016 | 109 | 4.655 | 0.955300 | 0.501630 | 0.730070 | 0.318660 | 0.053010 | 0.168400 | 1.928160
7 | Albania | Central and Eastern Europe | 2017 | 109 | 4.644 | 0.996193 | 0.803685 | 0.731160 | 0.381499 | 0.039864 | 0.201313 | 1.490442
8 | Albania | Central and Eastern Europe | 2018 | 112 | 4.586 | 0.916000 | 0.817000 | 0.790000 | 0.419000 | 0.032000 | 0.149000 | NaN
9 | Albania | Central and Eastern Europe | 2019 | 107 | 4.719 | 0.947000 | 0.848000 | 0.874000 | 0.383000 | 0.027000 | 0.178000 | NaN

Last rows

index | country_mapped | region | year | rank | score | gdp_pc | family | health | freedom | trust | generosity | dystopia
772 | Zambia | Sub-Saharan Africa | 2015 | 85 | 5.129 | 0.470380 | 0.916120 | 0.299240 | 0.488270 | 0.124680 | 0.195910 | 2.634300
773 | Zambia | Sub-Saharan Africa | 2016 | 106 | 4.795 | 0.612020 | 0.637600 | 0.235730 | 0.426620 | 0.114790 | 0.178660 | 2.589910
774 | Zambia | Sub-Saharan Africa | 2017 | 116 | 4.514 | 0.636407 | 1.003187 | 0.257836 | 0.461603 | 0.078214 | 0.249580 | 1.826705
775 | Zambia | Sub-Saharan Africa | 2018 | 125 | 4.377 | 0.562000 | 1.047000 | 0.295000 | 0.503000 | 0.082000 | 0.221000 | NaN
776 | Zambia | Sub-Saharan Africa | 2019 | 138 | 4.107 | 0.578000 | 1.058000 | 0.426000 | 0.431000 | 0.087000 | 0.247000 | NaN
777 | Zimbabwe | Sub-Saharan Africa | 2015 | 115 | 4.610 | 0.271000 | 1.032760 | 0.334750 | 0.258610 | 0.080790 | 0.189870 | 2.441910
778 | Zimbabwe | Sub-Saharan Africa | 2016 | 131 | 4.193 | 0.350410 | 0.714780 | 0.159500 | 0.254290 | 0.085820 | 0.185030 | 2.442700
779 | Zimbabwe | Sub-Saharan Africa | 2017 | 138 | 3.875 | 0.375847 | 1.083096 | 0.196764 | 0.336384 | 0.095375 | 0.189143 | 1.597970
780 | Zimbabwe | Sub-Saharan Africa | 2018 | 144 | 3.692 | 0.357000 | 1.094000 | 0.248000 | 0.406000 | 0.099000 | 0.132000 | NaN
781 | Zimbabwe | Sub-Saharan Africa | 2019 | 146 | 3.663 | 0.366000 | 1.114000 | 0.433000 | 0.361000 | 0.089000 | 0.151000 | NaN